NIST-ARPA Interagency Agreement: Human Language Technology Program

Author

  • David S. Pallett
Abstract

PROJECT GOALS

1. To coordinate the design, development and distribution of speech and natural language corpora for the ARPA Spoken Language research community, and the use of these corpora for technology development and evaluation.
2. To design, coordinate the implementation of, and analyze the results of performance assessment benchmark tests for ARPA's speech recognition and spoken language understanding systems.

RECENT RESULTS

1. Participated, with SRI International, in annotation and "bug fixes" for the ATIS MADCOW-collected corpora.
2. Installed BBN-developed and SRI-developed ATIS technology at NIST, and used these systems to collect test and training data using subjects recruited from the Gaithersburg, MD area.
3. Produced speech corpora on recordable and pressed CD-ROM media in collaboration with the Linguistic Data Consortium.
4. Participated in discussions regarding implementation of the Semantic Evaluation (SemEval) glass box test protocols.
5. Prepared for and implemented benchmark tests for the Wall Street Journal-based Continuous Speech Recognition (WSJ-CSR) corpus using the Hub-and-Spoke test paradigm and for the 46-city ATIS corpus.

PLANS

1. Continue to collaborate with the LDC, its data collection and annotation contractors, and the MADCOW community with regard to data collection, annotation, screening and quality control procedures, and, as appropriate, to produce CD-ROMs for early release within the community of test participants.
2. Participate in the ATIS SemEval effort, probably including the development of detailed test and reporting protocols for a "dry run" of an ATIS SemEval test.
3. Collect additional ATIS data at NIST as appropriate.
4. Continue to participate in the development of improved speech transcription and scoring procedures, in ATIS Principles of Interpretation documents, and, in cooperation with the annotators at SRI, in "bug-report adjudication".
5. Review the use of phonologically-motivated string alignment software for use in scoring speech recognition system output.
6. Prepare for and implement benchmark tests in the WSJ-CSR and ATIS domains in the November 1994 time frame.
7. Participate in the endeavors of the CCCC and MADCOW communities...

Related resources

Overview of the 1994 ARPA Human Language Technology Workshop

This volume presents papers, session summaries, and project summaries from the Second ARPA Human Language Technology Workshop, which was held at the Merrill Lynch Conference Center in Plainsboro, NJ, March 8-11, 1994. The Workshop was actually the seventh in a series of ARPA workshops which began in 1988; the first five were called the DARPA Speech and Natural Language Workshops, and the name w...

NIST-DARPA Interagency Agreement: SLS Program

[1] Prepared the CD-ROM versions of the Resource Management (RM1) and Extended Resource Management (RM2) Corpora (a total of six discs) for use within the DARPA speech research community and for public distribution through the National Technical Information Service (NTIS). These discs include the scoring software and statistical significance tools for benchmark performance assessment tests.

NIST Technical Note 1834 Development of a Test Method to Determine Carbon Monoxide Emission Rates from Portable Generators

The report titled, “Technical Note 1834: Development of a Test Method to Determine Carbon Monoxide Emission Rates from Portable Generators,” presents a test method developed by the National Institute of Standards and Technology (NIST) for determining carbon monoxide (CO) emission rates from portable generators while operating in an enclosed space. NIST developed the test method under an interag...

Semantic Evaluation for Spoken-Language Systems

Development has begun on a semantic evaluation (SemEval) methodology and infrastructure for the ARPA Spoken Language Program. SemEval is an attempt to define a task-independent technology-based evaluation for language-understanding systems consisting of three parts: word-sense identification, predicate-argument structure determination, and identification of coreference relations. An initial spok...

A Stewart Platform Lunar Rover

A lunar version of the Robocrane is being developed at the Robot Systems Division of the National Institute of Standards and Technology (NIST) to address the needs of NASA researchers. The NIST robot, called ROBOCRANE, has three pairs of rigged legs instead of actuators to create a gantry. The legs are joined in a Stewart Platform configuration providing a means for canceling out rotations norm...

Publication date: 1994